GPT models were first launched by OpenAI in 2018. All GPT models use the transformer architecture for natural language processing tasks such as:
- Language generation
- Translation
- Question answering
![[Pasted image 20230219144933.png]]
## Generative
Generative refers to the model's ability to **generate new text** based on patterns it has learned from its training data.
A generative language model like GPT is capable of producing coherent text in response to a prompt rather than selecting a predefined response.
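As a toy illustration of "generating from learned patterns" (this miniature bigram sampler is a made-up stand-in for GPT's learned distribution, not the transformer itself), new text can be produced by repeatedly sampling a plausible next word:

```python
import random

# Hypothetical tiny "training data"; GPT learns from vastly more text.
corpus = "the cat sat on the mat and the dog sat on the rug".split()

# Learn which word tends to follow which (the model's "patterns").
bigrams = {}
for prev, nxt in zip(corpus, corpus[1:]):
    bigrams.setdefault(prev, []).append(nxt)

def generate(prompt_word, length=5, seed=0):
    """Generate new text by repeatedly sampling a likely next word."""
    random.seed(seed)
    out = [prompt_word]
    for _ in range(length):
        candidates = bigrams.get(out[-1])
        if not candidates:
            break
        out.append(random.choice(candidates))
    return " ".join(out)

print(generate("the"))
```

The key point is that the continuation is sampled, not looked up: the same prompt can yield different coherent outputs, rather than one predefined response.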
## Pre-Trained
Pre-trained means that the model has **already been trained** on a large amount of text data before it is fine-tuned for specific tasks. This allows it to learn faster and achieve better results than training from scratch.
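A minimal sketch of why this helps, using a toy 1-D regression instead of a language model (the tasks, learning rate, and step counts are all illustrative assumptions): starting fine-tuning from parameters that already fit a related task beats a random start under the same small training budget.

```python
# "Pre-train" on one task, then "fine-tune" on a related task, and compare
# against training on the new task from scratch with the same budget.

def train(w, data, lr=0.01, steps=0):
    """A few steps of gradient descent on mean squared error for y = w * x."""
    for _ in range(steps):
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w

def loss(w, data):
    return sum((w * x - y) ** 2 for x, y in data) / len(data)

# "Pre-training" task: y = 2.0x. "Fine-tuning" task: the related y = 2.1x.
pretrain_data = [(x, 2.0 * x) for x in (1.0, 2.0, 3.0)]
finetune_data = [(x, 2.1 * x) for x in (1.0, 2.0, 3.0)]

w_pretrained = train(0.0, pretrain_data, steps=200)       # ends near 2.0
w_from_pretrained = train(w_pretrained, finetune_data, steps=5)
w_from_scratch = train(0.0, finetune_data, steps=5)

# With the same 5-step budget, the pre-trained start ends up far closer.
print(loss(w_from_pretrained, finetune_data), loss(w_from_scratch, finetune_data))
```

The pre-trained start only has to close the small gap between the two related tasks, which is the intuition behind fine-tuning GPT for specific tasks.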
## [[Transformers]]
The transformer is the architecture used in [[Generative Pretrained Transformers - GPT]] models. It is a type of neural network that has become the gold-standard architecture for natural language processing. Unlike [[Recurrent Neural Networks - RNNs]], which process text one token at a time, a transformer uses self-attention to relate all positions in a sequence at once, so it can effectively process long sequences of text without losing information from earlier tokens.
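The self-attention step at the heart of the transformer can be sketched in a few lines of NumPy. This is a minimal single-head illustration with random weights standing in for learned projections; the shapes are arbitrary assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

seq_len, d_model = 4, 8          # 4 tokens, 8-dimensional embeddings
x = rng.normal(size=(seq_len, d_model))

# Learned projections (random here) map each token to query, key, value.
W_q, W_k, W_v = (rng.normal(size=(d_model, d_model)) for _ in range(3))
Q, K, V = x @ W_q, x @ W_k, x @ W_v

# Every token attends to every other token in one step, which is how the
# transformer relates distant positions without an RNN's sequential pass.
scores = Q @ K.T / np.sqrt(d_model)
weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
weights /= weights.sum(axis=-1, keepdims=True)   # softmax over positions
output = weights @ V                              # shape: (seq_len, d_model)
```

Because the attention weights connect position 1 to position 4 as directly as to position 2, distance in the sequence does not degrade the information flow the way it does in an RNN's step-by-step hidden state.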